105 research outputs found

    A Study on Prevention of Non-Performing Assets of Chinese State-Owned Commercial Banks

    Get PDF
    For a long time, Chinese state-owned commercial banks have to face the actualities of large non-performing assets and a high rate of non-performing assets. Based on China’s national condition and referring to the US banking industry, this article makes proposals on preventing and controlling non-performing assets from four aspects: state-owned enterprise system, government regulation, credit risk management, and disposal of non-performing assets

    Event-triggered communication for passivity and synchronisation of multi-weighted coupled neural networks with and without parameter uncertainties

    Get PDF
    A multi-weighted coupled neural networks (MWCNNs) model with event-triggered communication is studied here. On the one hand, the passivity of the presented network model is studied by utilising Lyapunov stability theory and some inequality techniques, and a synchronisation criterion based on the obtained output-strict passivity condition of MWCNNs with eventtriggered communication is derived. On the other hand, some robust passivity and robust synchronisation criteria based on output-strict passivity of the proposed network with uncertain parameters are presented. At last, two numerical examples are provided to testify the effectiveness of the output-strict passivity and robust synchronisation results

    Cross-identity Video Motion Retargeting with Joint Transformation and Synthesis

    Full text link
    In this paper, we propose a novel dual-branch Transformation-Synthesis network (TS-Net), for video motion retargeting. Given one subject video and one driving video, TS-Net can produce a new plausible video with the subject appearance of the subject video and motion pattern of the driving video. TS-Net consists of a warp-based transformation branch and a warp-free synthesis branch. The novel design of dual branches combines the strengths of deformation-grid-based transformation and warp-free generation for better identity preservation and robustness to occlusion in the synthesized videos. A mask-aware similarity module is further introduced to the transformation branch to reduce computational overhead. Experimental results on face and dance datasets show that TS-Net achieves better performance in video motion retargeting than several state-of-the-art models as well as its single-branch variants. Our code is available at https://github.com/nihaomiao/WACV23_TSNet.Comment: WACV 202

    FakeLocator: Robust Localization of GAN-Based Face Manipulations

    Full text link
    Full face synthesis and partial face manipulation by virtue of the generative adversarial networks (GANs) and its variants have raised wide public concerns. In the multi-media forensics area, detecting and ultimately locating the image forgery has become an imperative task. In this work, we investigate the architecture of existing GAN-based face manipulation methods and observe that the imperfection of upsampling methods therewithin could be served as an important asset for GAN-synthesized fake image detection and forgery localization. Based on this basic observation, we have proposed a novel approach, termed FakeLocator, to obtain high localization accuracy, at full resolution, on manipulated facial images. To the best of our knowledge, this is the very first attempt to solve the GAN-based fake localization problem with a gray-scale fakeness map that preserves more information of fake regions. To improve the universality of FakeLocator across multifarious facial attributes, we introduce an attention mechanism to guide the training of the model. To improve the universality of FakeLocator across different DeepFake methods, we propose partial data augmentation and single sample clustering on the training images. Experimental results on popular FaceForensics++, DFFD datasets and seven different state-of-the-art GAN-based face generation methods have shown the effectiveness of our method. Compared with the baselines, our method performs better on various metrics. Moreover, the proposed method is robust against various real-world facial image degradations such as JPEG compression, low-resolution, noise, and blur.Comment: 16 pages, accepted to IEEE Transactions on Information Forensics and Securit

    Protect Federated Learning Against Backdoor Attacks via Data-Free Trigger Generation

    Full text link
    As a distributed machine learning paradigm, Federated Learning (FL) enables large-scale clients to collaboratively train a model without sharing their raw data. However, due to the lack of data auditing for untrusted clients, FL is vulnerable to poisoning attacks, especially backdoor attacks. By using poisoned data for local training or directly changing the model parameters, attackers can easily inject backdoors into the model, which can trigger the model to make misclassification of targeted patterns in images. To address these issues, we propose a novel data-free trigger-generation-based defense approach based on the two characteristics of backdoor attacks: i) triggers are learned faster than normal knowledge, and ii) trigger patterns have a greater effect on image classification than normal class patterns. Our approach generates the images with newly learned knowledge by identifying the differences between the old and new global models, and filters trigger images by evaluating the effect of these generated images. By using these trigger images, our approach eliminates poisoned models to ensure the updated global model is benign. Comprehensive experiments demonstrate that our approach can defend against almost all the existing types of backdoor attacks and outperform all the seven state-of-the-art defense methods with both IID and non-IID scenarios. Especially, our approach can successfully defend against the backdoor attack even when 80\% of the clients are malicious

    GitFL: Adaptive Asynchronous Federated Learning using Version Control

    Full text link
    As a promising distributed machine learning paradigm that enables collaborative training without compromising data privacy, Federated Learning (FL) has been increasingly used in AIoT (Artificial Intelligence of Things) design. However, due to the lack of efficient management of straggling devices, existing FL methods greatly suffer from the problems of low inference accuracy and long training time. Things become even worse when taking various uncertain factors (e.g., network delays, performance variances caused by process variation) existing in AIoT scenarios into account. To address this issue, this paper proposes a novel asynchronous FL framework named GitFL, whose implementation is inspired by the famous version control system Git. Unlike traditional FL, the cloud server of GitFL maintains a master model (i.e., the global model) together with a set of branch models indicating the trained local models committed by selected devices, where the master model is updated based on both all the pushed branch models and their version information, and only the branch models after the pull operation are dispatched to devices. By using our proposed Reinforcement Learning (RL)-based device selection mechanism, a pulled branch model with an older version will be more likely to be dispatched to a faster and less frequently selected device for the next round of local training. In this way, GitFL enables both effective control of model staleness and adaptive load balance of versioned models among straggling devices, thus avoiding the performance deterioration. Comprehensive experimental results on well-known models and datasets show that, compared with state-of-the-art asynchronous FL methods, GitFL can achieve up to 2.64X training acceleration and 7.88% inference accuracy improvements in various uncertain scenarios

    Towards Better Fairness-Utility Trade-off: A Comprehensive Measurement-Based Reinforcement Learning Framework

    Full text link
    Machine learning is widely used to make decisions with societal impact such as bank loan approving, criminal sentencing, and resume filtering. How to ensure its fairness while maintaining utility is a challenging but crucial issue. Fairness is a complex and context-dependent concept with over 70 different measurement metrics. Since existing regulations are often vague in terms of which metric to use and different organizations may prefer different fairness metrics, it is important to have means of improving fairness comprehensively. Existing mitigation techniques often target at one specific fairness metric and have limitations in improving multiple notions of fairness simultaneously. In this work, we propose CFU (Comprehensive Fairness-Utility), a reinforcement learning-based framework, to efficiently improve the fairness-utility trade-off in machine learning classifiers. A comprehensive measurement that can simultaneously consider multiple fairness notions as well as utility is established, and new metrics are proposed based on an in-depth analysis of the relationship between different fairness metrics. The reward function of CFU is constructed with comprehensive measurement and new metrics. We conduct extensive experiments to evaluate CFU on 6 tasks, 3 machine learning models, and 15 fairness-utility measurements. The results demonstrate that CFU can improve the classifier on multiple fairness metrics without sacrificing its utility. It outperforms all state-of-the-art techniques and has witnessed a 37.5% improvement on average

    LivelySpeaker: Towards Semantic-Aware Co-Speech Gesture Generation

    Full text link
    Gestures are non-verbal but important behaviors accompanying people's speech. While previous methods are able to generate speech rhythm-synchronized gestures, the semantic context of the speech is generally lacking in the gesticulations. Although semantic gestures do not occur very regularly in human speech, they are indeed the key for the audience to understand the speech context in a more immersive environment. Hence, we introduce LivelySpeaker, a framework that realizes semantics-aware co-speech gesture generation and offers several control handles. In particular, our method decouples the task into two stages: script-based gesture generation and audio-guided rhythm refinement. Specifically, the script-based gesture generation leverages the pre-trained CLIP text embeddings as the guidance for generating gestures that are highly semantically aligned with the script. Then, we devise a simple but effective diffusion-based gesture generation backbone simply using pure MLPs, that is conditioned on only audio signals and learns to gesticulate with realistic motions. We utilize such powerful prior to rhyme the script-guided gestures with the audio signals, notably in a zero-shot setting. Our novel two-stage generation framework also enables several applications, such as changing the gesticulation style, editing the co-speech gestures via textual prompting, and controlling the semantic awareness and rhythm alignment with guided diffusion. Extensive experiments demonstrate the advantages of the proposed framework over competing methods. In addition, our core diffusion-based generative model also achieves state-of-the-art performance on two benchmarks. The code and model will be released to facilitate future research.Comment: Accepted by ICCV 202

    Form-NLU: Dataset for the Form Language Understanding

    Full text link
    Compared to general document analysis tasks, form document structure understanding and retrieval are challenging. Form documents are typically made by two types of authors; A form designer, who develops the form structure and keys, and a form user, who fills out form values based on the provided keys. Hence, the form values may not be aligned with the form designer's intention (structure and keys) if a form user gets confused. In this paper, we introduce Form-NLU, the first novel dataset for form structure understanding and its key and value information extraction, interpreting the form designer's intent and the alignment of user-written value on it. It consists of 857 form images, 6k form keys and values, and 4k table keys and values. Our dataset also includes three form types: digital, printed, and handwritten, which cover diverse form appearances and layouts. We propose a robust positional and logical relation-based form key-value information extraction framework. Using this dataset, Form-NLU, we first examine strong object detection models for the form layout understanding, then evaluate the key information extraction task on the dataset, providing fine-grained results for different types of forms and keys. Furthermore, we examine it with the off-the-shelf pdf layout extraction tool and prove its feasibility in real-world cases.Comment: Accepted by SIGIR 202
    corecore